53 research outputs found

    Learning visual contexts for image annotation from Flickr groups

    Get PDF

    Composition of Constraint, Hypothesis and Error Models to improve interaction in Human-Machine Interfaces

    Full text link
    We use Weighted Finite-State Transducers (WFSTs) to represent the different sources of information available: the initial hypotheses, the possible errors, the constraints imposed by the task (the interaction language) and the user input. The fusion of these models to find the most probable output string can be performed efficiently by using carefully selected transducer operations. The proposed system initially suggests an output based on the set of hypotheses, possible errors and Constraint Models. Then, if human intervention is needed, a multimodal approach, where the user input is combined with the aforementioned models, is applied to produce the desired output with minimal user effort. This approach offers the practical advantages of a decoupled model (e.g. input system + parameterized rules + post-processor), while keeping the error-recovery power of an integrated approach, where all the steps of the process are performed in the same formal machine (as in a typical HMM in speech recognition) so that an error at a given step does not become unrecoverable in the subsequent steps. After a presentation of the theoretical basis of the proposed multi-source information system, its application to two real-world problems is addressed as an example of the possibilities of this architecture. The experimental results obtained demonstrate that significant user effort can be saved when using the proposed procedure. A simple demonstration, intended to help understand and evaluate the proposed system, is available on the web at https://demos.iti.upv.es/hi/.
    Navarro Cerdan, J.R.; Llobet Azpitarte, R.; Arlandis, J.; Perez-Cortes, J. (2016). Composition of Constraint, Hypothesis and Error Models to improve interaction in Human-Machine Interfaces. Information Fusion, 29:1-13. doi:10.1016/j.inffus.2015.09.001
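    The core idea described above, fusing hypothesis, error and constraint models by transducer composition and then decoding the best path, can be sketched with the OpenFst Python bindings (pynini). The toy alphabet, models and weights below are invented for illustration; this is not the authors' system.

    ```python
    # Minimal sketch of hypothesis/error/constraint fusion via WFST composition,
    # using pynini (OpenFst bindings). Toy models only; not the authors' system.
    import pynini

    # Hypothesis model H: weighted recognizer alternatives
    # (weights are costs, i.e. negative log-probabilities, tropical semiring).
    H = pynini.union(
        pynini.accep("13O", weight=0.2),   # most likely OCR read; ends in the letter O
        pynini.accep("180", weight=1.5),   # a less likely alternative
    ).optimize()

    # Error model E: identity over the working alphabet, plus a costly O -> 0 repair.
    ident = pynini.union(*"0138O")
    repair = pynini.cross("O", "0") + pynini.accep("", weight=1.0)  # pay 1.0 for the fix
    E = pynini.union(ident, repair).closure().optimize()

    # Constraint model C: the interaction language accepts only three-digit strings.
    digit = pynini.union(*"0123456789")
    C = (digit + digit + digit).optimize()

    # Fusion: compose H o E o C and decode the cheapest (most probable) output string.
    best = pynini.shortestpath(H @ E @ C).project("output").rmepsilon().string()
    print(best)  # "130": the top hypothesis repaired by E so that it satisfies C
    ```

    In the interactive setting described in the abstract, the user-input transducer would simply be composed in as an additional factor before taking the shortest path.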

    Efficient search in hidden text of large DjVu documents

    Get PDF
    The paper describes an open-source tool for presenting end-users with the results of advanced language technologies. It relies on the DjVu format, which for some applications is still superior to other modern formats, including PDF/A. The GPL-licensed DjVu tools are not limited to the DjVuLibre library; they are being supplemented by various new programs, such as pdf2djvu, developed by Jakub Wilk. In particular, pdf2djvu can convert the PDF output of popular OCR programs like FineReader to DjVu while preserving the hidden text layer and some other features. The tool in question was conceived by the present author and consists of a modification of the Poliqarp corpus query tool used for the National Corpus of Polish; his ideas have been very successfully implemented by Jakub Wilk. The new system, called here simply Poliqarp for DjVu, inherits from its origin not only the powerful search facilities based on two-level regular expressions, but also the ability to represent low-level ambiguities and other linguistic phenomena. Although at present the tool is used mainly to facilitate access to the results of dirty OCR, it is also ready to handle more sophisticated output of linguistic technologies.
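    Independently of the Poliqarp query engine itself, the basic step it builds on, searching the hidden (OCR) text layer of a DjVu file, can be illustrated with DjVuLibre's djvutxt utility, which dumps that layer as plain text. The sketch below is illustrative only; the file name and pattern are placeholders, and a PDF produced by an OCR program can first be converted with pdf2djvu (e.g. pdf2djvu -o scan.djvu scan.pdf) so that the text layer is preserved.

    ```python
    # Minimal sketch: regex search over the hidden text layer of a DjVu document.
    # Requires DjVuLibre's djvutxt on PATH; file name and pattern are placeholders.
    import re
    import subprocess

    def search_hidden_text(djvu_path: str, pattern: str):
        """Yield (line_number, line) pairs whose hidden text matches the pattern."""
        # djvutxt prints the document's hidden (OCR) text layer to stdout.
        text = subprocess.run(
            ["djvutxt", djvu_path],
            capture_output=True, text=True, check=True,
        ).stdout
        regex = re.compile(pattern)
        for lineno, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                yield lineno, line

    if __name__ == "__main__":
        for lineno, line in search_hidden_text("scan.djvu", r"Poliqarp\w*"):
            print(f"{lineno}: {line.strip()}")
    ```

    Poliqarp for DjVu goes well beyond such a linear scan, with its two-level regular expressions and the linguistic annotations mentioned above, but the hidden text layer is what both rely on.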

    Coupled snakelets for curled text-line segmentation from warped document images

    No full text
    IJDAR. doi:10.1007/s10032-011-0176-2

    Welcome from the program chairs: ICDAR 2011

    No full text
    Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), p. xxix. doi:10.1109/ICDAR.2011.6

    Topic models for semantics-preserving video compression

    No full text
    Most state-of-the-art systems for content-based video understanding tasks require video content to be represented as collections of many low-level descriptors, e.g. as histograms of the color, texture or motion in local image regions. In order to preserve as much of the information contained in the original video as possible, these representations are typically high-dimensional, which conflicts with the goal of compact descriptors that would allow better efficiency and lower storage requirements. In this paper, we address the problem of semantic compression of video, i.e. the reduction of low-level descriptors to a small number of dimensions while preserving most of the semantic information. For this, we adapt topic models, which have previously been used as compact representations of still images, to take into account the temporal structure of a video, as well as multi-modal components such as motion information. Experiments on a large-scale collection of YouTube videos show that we can achieve a compression ratio of 20:1 compared to ordinary histogram representations and at least 2:1 compared to other dimensionality reduction techniques, without significant loss of prediction accuracy. Improvements are also demonstrated for our video-specific extensions modeling temporal structure and multiple modalities.
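    The compression step itself can be sketched with an off-the-shelf topic model: the snippet below uses scikit-learn's LatentDirichletAllocation to reduce bag-of-visual-words histograms to a small number of topic proportions. It is an illustrative sketch with random stand-in data, not the paper's model, and it omits the temporal and multi-modal extensions described above.

    ```python
    # Minimal sketch: compressing high-dimensional descriptor histograms into a
    # low-dimensional topic representation with LDA (scikit-learn). Stand-in data;
    # the paper's temporal and multi-modal extensions are not reproduced here.
    import numpy as np
    from sklearn.decomposition import LatentDirichletAllocation

    rng = np.random.default_rng(0)

    # Pretend each video segment is a 2000-bin bag-of-visual-words histogram.
    n_segments, vocab_size, n_topics = 500, 2000, 100
    histograms = rng.poisson(lam=1.0, size=(n_segments, vocab_size))

    # Fit LDA and represent every segment by its topic proportions.
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
    topic_repr = lda.fit_transform(histograms)   # shape: (500, 100)

    print(histograms.shape, "->", topic_repr.shape)
    print("compression ratio: %d : 1" % (vocab_size // n_topics))   # 20 : 1
    ```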